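Data augmentation is commonly applied to improve the performance of deep learning by enforcing the knowledge that certain transformations of the input preserve the output. Currently, the data augmentation to use is chosen by human effort and costly cross-validation, which makes it cumbersome to apply to new datasets. We develop a convenient gradient-based method for selecting the data augmentation without validation data and during the training of a deep neural network. Our approach relies on phrasing data augmentation as an invariance in the prior distribution and learning it using Bayesian model selection, which has been shown to work in Gaussian processes but has not yet been used for deep neural networks. We propose a differentiable Kronecker-factored Laplace approximation to the marginal likelihood as our objective, which can be optimised without human supervision or validation data. We show that our method can successfully recover invariances present in the data, and that this improves generalisation and data efficiency on image datasets.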
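Uncertainty estimation in deep learning has recently emerged as a crucial area of interest for advancing reliability and robustness in safety-critical applications. While many proposed methods focus either on distance-aware model uncertainties for out-of-distribution detection or on input-dependent label uncertainties for in-distribution calibration, both of these types of uncertainty are often necessary. In this work, we propose the HetSNGP method for jointly modeling model and data uncertainty. We show that our proposed model affords a favorable combination of these two types of uncertainty and thus outperforms the baseline methods on some challenging out-of-distribution datasets, including CIFAR-100C, ImageNet-C, and ImageNet-A. Moreover, we propose HetSNGP Ensemble, an ensembled version of our method that additionally models uncertainty over the network parameters and outperforms other ensemble baselines.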
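We propose a novel Bayesian neural network architecture that can learn invariances from data alone by inferring a posterior distribution over different weight-sharing schemes. We show that our model outperforms other non-invariant architectures when trained on datasets that contain specific invariances. The same holds true when no data augmentation is performed.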
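Bayesian neural networks that incorporate data augmentation implicitly use a "randomly perturbed log-likelihood [which] does not have a clean interpretation as a valid likelihood function" (Izmailov et al., 2021). Here, we provide several approaches to developing principled Bayesian neural networks that incorporate data augmentation. We introduce a "finite orbit" setting, which allows the likelihood to be computed exactly, and give tight multi-sample bounds in the more usual "full orbit" setting. These models cast light on the origin of the cold posterior effect. In particular, we find that the cold posterior effect persists even in these principled models incorporating data augmentation. This suggests that the cold posterior effect cannot be dismissed as an artifact of data augmentation using incorrect likelihoods.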
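As a rough illustration of what a finite orbit makes possible, the sketch below averages class probabilities over every element of a small, fully enumerable set of augmentations, so the resulting likelihood is computed exactly and is invariant by construction. Everything here is an assumption made for illustration: the orbit is taken to be the four 90-degree rotations, the classifier is a hypothetical random linear softmax model, and averaging probabilities (rather than logits) is one possible choice that the abstract does not specify.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over a 1-D vector of logits."""
    shifted = logits - logits.max()
    exp = np.exp(shifted)
    return exp / exp.sum()

def orbit(image):
    """Assumed finite orbit: the four 90-degree rotations of a square image."""
    return [np.rot90(image, k) for k in range(4)]

def orbit_averaged_probs(image, weights):
    """Average class probabilities over every element of the finite orbit."""
    probs = [softmax(weights @ g.ravel()) for g in orbit(image)]
    return np.mean(probs, axis=0)

def log_likelihood(image, label, weights):
    """Exact log-likelihood of a label under the orbit-averaged predictor."""
    return float(np.log(orbit_averaged_probs(image, weights)[label]))

# Hypothetical toy data: a 4x4 "image" and a random 3-class linear classifier.
rng = np.random.default_rng(0)
image = rng.normal(size=(4, 4))
weights = rng.normal(size=(3, 16))

p_original = orbit_averaged_probs(image, weights)
p_rotated = orbit_averaged_probs(np.rot90(image), weights)
```

Because rotating the input only permutes the elements of the orbit, `p_original` and `p_rotated` agree up to floating-point error, which is the invariance that the averaging is meant to provide.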
Charisma is commonly understood as one's ability to attract and potentially also influence others. From the perspective of artificial intelligence (AI), there is clearly considerable interest in providing machines with such a skill. Beyond this, a plethora of use cases opens up for the computational measurement of human charisma, such as tutoring humans in the acquisition of charisma, mediating human-to-human conversation, or identifying charismatic individuals in big social data. A number of models exist that base charisma on various dimensions, often following the idea that charisma is given if someone could and would help others. Examples include influence (could help) and affability (would help) in scientific studies, or power (could help), presence, and warmth (both would help) as a popular concept. Modelling high levels in these dimensions for humanoid robots or virtual agents seems accomplishable. Automatic measurement also appears quite feasible given recent advances in the related fields of Affective Computing and Social Signal Processing. Here, we therefore present a blueprint for building machines that can appear charismatic, but also analyse the charisma of others. To this end, we first provide the psychological perspective, including different models of charisma and its behavioural cues. We then switch to conversational charisma in spoken language as an exemplary modality that is essential for human-human and human-computer conversations. The computational perspective then deals with the recognition and generation of charismatic behaviour by AI. This includes an overview of the state of play in the field and the aforementioned blueprint. We then name exemplary use cases of computational charismatic skills before switching to ethical aspects and concluding this overview and perspective on building charisma-enabled AI.
There are two important things in science: (A) Finding answers to given questions, and (B) Coming up with good questions. Our artificial scientists not only learn to answer given questions, but also continually invent new questions, by proposing hypotheses to be verified or falsified through potentially complex and time-consuming experiments, including thought experiments akin to those of mathematicians. While an artificial scientist expands its knowledge, it remains biased towards the simplest, least costly experiments that still have surprising outcomes, until they become boring. We present an empirical analysis of the automatic generation of interesting experiments. In the first setting, we investigate self-invented experiments in a reinforcement-providing environment and show that they lead to effective exploration. In the second setting, pure thought experiments are implemented as the weights of recurrent neural networks generated by a neural experiment generator. Initially interesting thought experiments may become boring over time.
Recent advances in deep learning have enabled us to address the curse of dimensionality (COD) by solving problems in higher dimensions. A subset of these approaches has led to methods for solving high-dimensional PDEs. This has opened the door to solving a variety of real-world problems, ranging from mathematical finance to stochastic control for industrial applications. Although feasible, these deep learning methods are still constrained by training time and memory. To tackle these shortcomings, we demonstrate that Tensor Neural Networks (TNN) can provide significant parameter savings while attaining the same accuracy as the classical Dense Neural Network (DNN). In addition, we show how TNN can be trained faster than DNN for the same accuracy. Besides TNN, we also introduce Tensor Network Initializer (TNN Init), a weight initialization scheme that leads to faster convergence with smaller variance for an equivalent parameter count as compared to a DNN. We benchmark TNN and TNN Init by applying them to solve the parabolic PDE associated with the Heston model, which is widely used in financial pricing theory.
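To give a feel for where parameter savings of this kind can come from, the sketch below compares the parameter count of a dense weight matrix with that of a two-core tensor factorization of the same matrix. This is only an illustration under stated assumptions: the abstract does not specify the tensorization, so the two-core matrix-product-operator layout, the factor sizes, the bond dimension, and the helper names `dense_params`, `mpo_params`, and `mpo_to_dense` are all hypothetical.

```python
import numpy as np

def dense_params(n_in, n_out):
    """Number of weights in a dense n_in x n_out matrix (bias ignored)."""
    return n_in * n_out

def mpo_params(in_factors, out_factors, bond_dim):
    """Weights in an assumed two-core factorization with the given bond dimension."""
    (m1, m2), (n1, n2) = in_factors, out_factors
    return m1 * n1 * bond_dim + bond_dim * m2 * n2

def mpo_to_dense(core_a, core_b):
    """Contract cores of shape (m1, n1, r) and (r, m2, n2) into a dense matrix."""
    m1, n1, _ = core_a.shape
    _, m2, n2 = core_b.shape
    full = np.einsum("air,rbj->abij", core_a, core_b)  # shape (m1, m2, n1, n2)
    return full.reshape(m1 * m2, n1 * n2)

# Hypothetical sizes: a 1024 x 1024 layer with 1024 = 32 * 32 and bond dimension 4.
n_dense = dense_params(1024, 1024)
n_tensor = mpo_params((32, 32), (32, 32), bond_dim=4)

rng = np.random.default_rng(0)
core_a = rng.normal(size=(32, 32, 4))
core_b = rng.normal(size=(4, 32, 32))
weight = mpo_to_dense(core_a, core_b)
```

With these assumed sizes the dense layer holds 1,048,576 weights and the factorized one holds 8,192, while `weight` still has the full 1024 x 1024 shape. A smaller bond dimension saves more parameters but restricts the matrices the layer can represent.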
A statistical ensemble of neural networks can be described in terms of a quantum field theory (NN-QFT correspondence). The infinite-width limit is mapped to a free field theory, while finite N corrections are mapped to interactions. After reviewing the correspondence, we will describe how to implement renormalization in this context and discuss preliminary numerical results for translation-invariant kernels. A major outcome is that changing the standard deviation of the neural network weight distribution corresponds to a renormalization flow in the space of networks.
We present an automatic method for annotating images of indoor scenes with the CAD models of the objects by relying on RGB-D scans. Through a visual evaluation by 3D experts, we show that our method retrieves annotations that are at least as accurate as manual annotations, and can thus be used as ground truth without the burden of manually annotating 3D data. We do this using an analysis-by-synthesis approach, which compares renderings of the CAD models with the captured scene. We introduce a 'cloning procedure' that identifies objects that have the same geometry, to annotate these objects with the same CAD models. This allows us to obtain complete annotations for the ScanNet dataset and the recent ARKitScenes dataset.
This article presents a novel review of Active SLAM (A-SLAM) research conducted in the last decade. We discuss the formulation, application, and methodology applied in A-SLAM for trajectory generation and control action selection using information theory based approaches. Our extensive qualitative and quantitative analysis highlights the approaches, scenarios, configurations, types of robots, sensor types, dataset usage, and path planning approaches of A-SLAM research. We conclude by presenting the limitations and proposing future research possibilities. We believe that this survey will be helpful to researchers in understanding the various methods and techniques applied to A-SLAM formulation.